Instance Selection to Improve Gamma Classifier
Authors
Abstract
Pre-processing the dataset is an important stage in the Knowledge Discovery in Datasets (KDD) process, and filtering noise through instance selection is a necessary part of it. Instance selection reduces the risk of training supervised classifiers on misclassified and non-representative instances. This study aims to improve the performance of the Gamma associative classifier by introducing a novel similarity function to guide instance selection. The experiments, run over 15 datasets, cover several instance selection methods and analyze their influence on the performance of the Gamma classifier. The effectiveness of the proposed similarity function is tested, obtaining good results in terms of classifier accuracy and instance retention ratio.
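To make the two ideas in the abstract concrete (filtering noisy instances and measuring the instance retention ratio), here is a minimal sketch. It does not implement the paper's similarity function or the Gamma classifier, neither of which is described on this page. Instead it assumes a classic stand-in, Wilson's Edited Nearest Neighbor rule with Euclidean distance, and the function names `edited_nearest_neighbor` and `retention_ratio` as well as the toy data are hypothetical.

```python
import math
from collections import Counter


def edited_nearest_neighbor(X, y, k=3):
    """Wilson's editing (an assumed stand-in, not the paper's method).

    Keeps an instance only if its label matches the majority label of its
    k nearest neighbours under Euclidean distance. Returns kept indices.
    """
    kept = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Distances from instance i to every other instance, nearest first.
        neighbours = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )[:k]
        votes = Counter(y[j] for _, j in neighbours)
        majority_label = votes.most_common(1)[0][0]
        if majority_label == yi:
            kept.append(i)
    return kept


def retention_ratio(kept, total):
    """Fraction of the original training instances that survive selection."""
    return len(kept) / total


# Toy data: two tight clusters plus one mislabeled point inside cluster "a".
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
     (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),
     (0.05, 0.05)]
y = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]  # last label is noise

kept = edited_nearest_neighbor(X, y, k=3)
print(kept)                           # indices of retained instances
print(retention_ratio(kept, len(X)))  # share of instances retained
```

On this toy set the mislabeled point is dropped and the eight clean points are retained. A similarity function like the one the study proposes would, under this reading, take the place of the Euclidean distance used here.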
Similar resources
Instance Selection in the Performance of Gamma Associative Classifier
The Gamma associative classifier is among the most used classifiers of the alpha-beta associative approach. It has been used successfully to solve many Pattern Recognition tasks, including environmental applications. However, like most classifiers, Gamma suffers from the presence of noisy or mislabeled instances in the training sets. This paper evaluates the impact of using instance selection tec...
Evaluation of Classifiers in Software Fault-Proneness Prediction
The reliability of software depends on its fault-prone modules: the fewer fault-prone units a piece of software contains, the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules of software, it will be possible to judge the software's reliability. In predicting fault-prone software modules, one of the contributing features is the software metric by which one ...
Feature selection using genetic algorithm for breast cancer diagnosis: experiment on three different datasets
Objective(s): This study addresses feature selection for breast cancer diagnosis. The process uses a wrapper approach based on GA feature selection and a PS-classifier. The experimental results show that the proposed model is comparable to the other models on the Wisconsin breast cancer datasets. Materials and Methods: To evaluate the effectiveness of the proposed feature selection method, we ...
Improving Cascade Classifier Precision by Instance Selection and Outlier Generation
Besides the curse of dimensionality and imbalanced classes, unfavorable data distributions can hamper classification accuracy. This is particularly problematic with increasing dimensionality of the classification task. A classifier that can handle high-dimensional and imbalanced data sets is the cascade classification method for time series. The cascade classifier can compound unfavorable data d...
MLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection
Multi-label classification has gained significant attention in recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy for multi-label learning by leveraging label-specific ...
Journal title:
- Polibits
Volume 54, Issue
Pages -
Publication date: 2016